Molecular Systems Biology
○ Springer Science and Business Media LLC
Preprints posted in the last 30 days, ranked by how well they match Molecular Systems Biology's content profile, based on 142 papers previously published here. The average preprint has a 0.06% match score for this journal, so anything above that is already an above-average fit.
Jiang, Y.; Movassaghi, C. S.; Munoz-Estrada, J.; Sundararaman, N.; Momenzadeh, A.; Meyer, J. G.
Large-scale mass spectrometry-based proteomic screening could reveal cellular mechanisms of drug action at systems resolution but remains limited by experimental complexity and the difficulty of extracting insight from high-dimensional datasets. Here, we describe an end-to-end platform that combines semi-automated sample preparation, rapid LC-MS/MS, and AI agent-based data analysis to enable scalable proteomic screening. In a screen of 172 compounds in HepG2 cells, we generated 1,232 proteomes with more than 8,700 quantified proteins in approximately three weeks. Agentic AI reduced data analysis and interpretation time to less than one day while translating proteomic measurements into structured mechanism-oriented summaries and experimentally testable hypotheses. Guided by this framework, we validated: (1) a cholesterol-lowering effect of methylene blue in vitro and (2) an association between loratadine exposure and increased circulating iron in matched electronic health record analyses. This work establishes a scalable platform for generating proteomic drug perturbation data and automatically converting that data into mechanistic insights and candidate translational hypotheses using AI.
Goel, M.; Nissley, D. A.; Castellanos-Girouard, X.; Kuntz, C. P.; Wang, Y.; Mukhtar, M. S.; Serohijos, A.; Schlebach, J. P.
The propensity of proteins to form oligomers is ultimately dictated by their structural configuration(s). Proteins that persist in a discrete conformational state may form a limited number of specific interactions while those that sample a broader structural ensemble may instead associate with a wider array of partners. These intrinsic tendencies potentially constrain the way proteins navigate wider interaction networks. In this work, we aggregated and surveyed a wide variety of biophysical, biochemical, and cellular descriptors of the S. cerevisiae proteome to identify biases in the connectivity of its protein-protein interaction network. Using mass spectrometry-based interactome measurements and various protein stability estimates, we find that a disproportionate number of abundant, yet unstable binding proteins act as network hubs. Moreover, we show that these features alone can be used to discriminate between hubs and non-hub proteins with high accuracy (AUROC = 0.898). Interestingly, we find that half-lives of hub proteins depend on whether they reside within static complexes and/or whether they interact with molecular chaperones. Finally, we note that the observed connectivity biases associated with abundant, unstable proteins pertain only to network hubs, not to the bottlenecks that connect them. Together, our findings reveal how the conformational stability of a protein may constrain its context within protein-protein interaction networks.
Devlin, L.; Oudard, V.; Barthe, M.; Gosselin-Monplaisir, T.; Dupin, J.-B.; Condamine, F.; Baudry, J.; Cocaign-Bousquet, M.; Millard, P.; Enjalbert, B.
The long-held view that acetate, one of the main fermentation by-products of Escherichia coli, is toxic to microbial growth is currently challenged. Here, we demonstrate that acetate promotes E. coli adaptation to nutrient changes by accelerating growth resumption, with as little as 250 µM acetate being sufficient to shorten the lag phase by several hours. Acetate was found to be consumed via acetyl-CoA synthetase very early after the nutrient change. Transcriptomics, metabolomics and 13C-isotope labeling experiments show that acetate replenishes metabolic pools in the tricarboxylic acid cycle and upper glycolysis. Single-cell analyses reveal that acetate increases the adaptation speed of individual cells switching to the new nutrient. We conclude that the reuse of excreted acetate by E. coli facilitates metabolic adaptation by transiently replenishing central metabolite pools. This work identifies an unexpected role of acetate in the nutritional adaptation of E. coli, providing new insights into the physiological relevance of overflow metabolism.
Highlights:
- Acetate facilitates E. coli adaptation from one nutrient to another.
- Less than 250 µM acetate is sufficient to halve lag times.
- Acetate helps replenish metabolite pools in central carbon metabolism.
- Acetate excretion is an adaptative strategy to overcome resource fluctuations.
Liu, Y.; Coles, A. M.; Castiglione, J.; Venu Thiyagarajan, V.; Clifton, K.; Goyal, D.; Wu, J.; Sheridan, A.; Vujic, A.; Harris, K. M.; Manor, U.; Pereira, T. D.; Fan, J.; Lee, R. T.; Kosuri, P.
Resilience to cardiac stress is essential for health, yet the relationship between cardiomyocyte (CM) stress response and local microenvironment remains unclear. Here, we combined MERFISH spatial transcriptome profiling with Cellouette, an improved cell segmentation method, to determine CM-microenvironment relationships in a mouse model of ventricular pressure overload. We report the shape, transcription profile, spatial organization, and physical connectivity for >400,000 cells across stressed and healthy tissues. Under stress, CMs adopted a spectrum of emergent transcriptional states, with advanced states marked by a metabolic and pro-fibrotic shift. To discover CM-environment relationships, we performed a network analysis of physical cell connectivity combined with cell-type-specific profiling. We found that pro-fibrotic CM progression was tightly linked to distinct local microenvironments, and CM metabolic shifts could be inferred from transcriptional patterns in neighboring non-CM cells, revealing microenvironmental imprints of disease. We thus provide a resource for understanding the heterogeneity of outcome during cardiac pressure overload.
Highlights:
- Cellouette provides accurate segmentation for single-cell spatial transcriptomics in cardiac tissue.
- Pressure overload creates spatial gradients of cardiomyocyte pro-fibrotic states.
- Cardiomyocyte pro-fibrotic progression is linked to changes in local cell composition and gene expression.
- Transcriptional states of non-muscle cells predict metabolic state of adjacent cardiomyocytes.
Welter, A. S.; Mutschler, F.; Simon, M.; Giacomelli, C.; Branscheid, A.-C.; Manukyan, A.; Teixeira Alves, L. G.; Gerwien, M.; Kerridge, R.; Landthaler, M.; Wolf, J.; Selbach, M.
Even cells of the same type growing in the same environment show cell-to-cell differences in protein abundance, a phenomenon known as gene expression noise. This variability can be decomposed into intrinsic components, reflecting molecular randomness, and extrinsic components, arising from differences in cellular state. While gene expression noise has been studied genome-wide in microbes, its global organization remains largely unknown in mammalian cells. Here, we develop a spike-in-based stable isotope single-cell proteomics approach that enables robust quantification of protein-level gene expression noise across thousands of human proteins. We find that protein noise scales inversely with abundance until reaching a plateau, consistent with an extrinsic noise floor and conserved scaling principles observed in bacteria and yeast. Cell cycle stage and cell size contribute substantially to protein variability but do not fully account for the observed heterogeneity. Gene-specific features such as mRNA and protein half-lives and translation efficiency show only weak associations with protein noise, and variability at the mRNA level is a weak predictor of protein variability. Instead, protein noise is largely extrinsic, with coordinated variation across proteins encoding biologically organized cellular states. Consistently, coordinated proteome programs predict intercellular differences in proteome dynamics, linking protein variability to cellular function. Together, these results provide a proteome-wide view of gene expression noise in mammalian cells, establishing that protein-level variability encodes structured and functionally relevant differences in cellular state.
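The inverse scaling with a plateau described in this abstract can be illustrated with a simple phenomenological form that is often used for microbial noise data: squared coefficient of variation = intrinsic term / mean abundance + constant extrinsic floor. This is only a sketch under that assumption. The function name, the coefficient, and the floor value below are illustrative and are not taken from the preprint, which may fit a different model.

```python
def predicted_cv2(mean_abundance, intrinsic_coeff=1.0, extrinsic_floor=0.01):
    """Squared coefficient of variation: a 1/abundance term plus a constant floor.

    intrinsic_coeff and extrinsic_floor are illustrative values, not fitted ones.
    """
    return intrinsic_coeff / mean_abundance + extrinsic_floor


# Mean protein copy numbers spanning four orders of magnitude (hypothetical).
abundances = [10, 100, 1_000, 10_000, 100_000]
cv2 = [predicted_cv2(m) for m in abundances]

for m, v in zip(abundances, cv2):
    print(f"mean={m:>7}  CV^2={v:.5f}")
```

At low abundance the 1/mean term dominates, while at high abundance the values flatten toward the floor, which is the plateau behaviour the abstract attributes to extrinsic noise.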
Fenn, A.; Hueckelhoven, R.; Kamal, N.
Dual-organism RNA sequencing (RNA-seq) experiments, in which the transcriptomes of a host and a microbe are sequenced simultaneously, are increasingly used to study plant-microbe interactions. A central analytical goal is identifying effector proteins and their host targets through gene co-expression. Weighted Gene Co-expression Network Analysis (WGCNA) is the dominant tool for gene co-expression analyses, yet its ability to recover interaction-interface genes from a merged dual-organism matrix has not been systematically characterised. Here we present a simulation framework using real gene models from Hordeum vulgare (barley) and Blumeria graminis f. sp. Hordei M.Liu & Hambl (powdery mildew) to evaluate single-network WGCNA across a gradient of plant-to-fungal library size ratios (1:1-20:1), three levels of co-expression signal strength, and three WGCNA network construction types (signed, unsigned, signed hybrid). We embed 20 model effector genes (bridge genes) driven by a mixed host-pathogen eigengene and evaluate recovery using four metrics aligned with the biological objective: cross-species hub rank, top-decile hub enrichment, bridge gene detection rate, and bridge co-separation (the fraction of effector-target pairs co-assigned to the same detected module). Across 225 simulation runs (15 conditions x 5 replicates x 3 network types), bridge genes are robustly identifiable as cross-species connectivity hubs (mean rank 0.92 versus 0.50 for module genes) but co-assignment of effector-target pairs to the same module fails in 41% of runs due to scale-free topology collapse. Signal strength (2 = 0.12) and library ratio (2 = 0.22) are the primary determinants of co-separation, while network type choice accounts for less than 2%. A read-depth bias systematically inflates pathogen gene hub ranks relative to host genes at high ratios. 
These results establish that the method can identify effector candidates as cross-species hubs under a broad range of conditions, but reliable co-assignment requires adequate pathogen read depth and strong co-expression signal, both of which experimental design, not analytical parameterisation, must provide.
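The bridge co-separation metric is defined in this abstract as the fraction of effector-target pairs co-assigned to the same detected module. A minimal sketch of that definition follows. The function name and the gene and module labels are hypothetical, and treating the "grey" label (WGCNA's label for unassigned genes) as not a detected module is an assumption about how the authors score such pairs.

```python
def bridge_coseparation(module_of, pairs, unassigned="grey"):
    """Fraction of (effector, target) pairs placed in the same detected module.

    module_of: dict mapping gene ID -> module label.
    pairs: list of (effector_gene, target_gene) tuples.
    Pairs where either gene is unassigned do not count as co-assigned (assumption).
    """
    if not pairs:
        return 0.0
    same = 0
    for effector, target in pairs:
        m_e = module_of.get(effector, unassigned)
        m_t = module_of.get(target, unassigned)
        if m_e == m_t and m_e != unassigned:
            same += 1
    return same / len(pairs)


# Hypothetical module assignments for three effector-target pairs.
modules = {
    "eff1": "blue", "tgt1": "blue",    # co-assigned
    "eff2": "brown", "tgt2": "blue",   # split across modules
    "eff3": "grey", "tgt3": "grey",    # both unassigned
}
pairs = [("eff1", "tgt1"), ("eff2", "tgt2"), ("eff3", "tgt3")]
score = bridge_coseparation(modules, pairs)
print(score)
```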
Awasthi, D.; Verma, P.; Pandit, S. B.
Alternative splicing (AS) expands transcriptome and proteome diversity by differentially combining exons or their splice variants. Although RNA-seq studies have uncovered transcriptomic variability, understanding of the corresponding protein-level diversity remains limited. Mass spectrometry-based proteomics provides protein-level insights through MS/MS peptide annotations, which are mostly linked to gene/transcript or UniProt identifiers. However, tracing them to specific isoforms remains challenging due to the lack of exon mapping or inconsistent annotations. We developed PEXMap (PeptideEXonMapper), a k-mer-based proteogenomic framework that systematically maps MS/MS peptides to genes, transcripts, exons, or exon-exon junctions by exact matching of unique 8-mers derived from MS/MS peptides to those in reference databases from exon-resolved isoforms. PEXMap mappings of the human proteome from PeptideAtlas were concordant with the PeptideAtlas annotations. Applying PEXMap to liver and pancreas proteomes, we identified tissue-specific isoform expression and, similarly, annotated the cancer proteome. PEXMap's reliable mappings could provide insights into the role of AS in shaping proteomes across tissues and disease states. Source code is publicly available for download at GitHub: https://github.com/deepanshicbg/PEXMap and supported on Linux.
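The core idea of exact matching on unique 8-mers can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the PEXMap implementation: the function names, the exon IDs, and the amino-acid sequences are hypothetical, and the real tool's index layout and junction handling are not described in the abstract.

```python
from collections import defaultdict

K = 8  # k-mer length stated in the abstract


def kmers(seq, k=K):
    """Return the set of all length-k substrings of seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_unique_index(exon_seqs, k=K):
    """Map each k-mer that occurs in exactly one exon to that exon's ID."""
    owners = defaultdict(set)
    for exon_id, seq in exon_seqs.items():
        for km in kmers(seq, k):
            owners[km].add(exon_id)
    return {km: next(iter(ids)) for km, ids in owners.items() if len(ids) == 1}


def map_peptide(peptide, index, k=K):
    """Return the sorted exon IDs supported by unique k-mers found in the peptide."""
    return sorted({index[km] for km in kmers(peptide, k) if km in index})


# Hypothetical exon-level amino-acid sequences and one MS/MS peptide.
exons = {"exon1": "MKTAYIAKQRQISFVK", "exon2": "SHFSRQLEERLGLIEV"}
index = build_unique_index(exons)
hits = map_peptide("AYIAKQRQIS", index)
print(hits)
```

Discarding k-mers shared by several exons keeps every reported hit unambiguous, at the cost of leaving peptides from repeated regions unmapped.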
Farinas, M.; Bermudez, V.; Tsirvouli, E.; Zobolas, J.; Aittokallio, T.; Lehti, K.; Flobak, A.; Lippestad, K.
Effective drug combination therapies can improve cancer treatment, yet the mechanistic basis of drug synergy remains poorly understood. Most computational approaches prioritize predictive accuracy over molecular mechanistic interpretability and hence provide limited insight into how synergistic effects emerge across signalling contexts. We developed Trafikk, a molecular-signalling network-based framework that simulates drug perturbations in cell line-specific computational models to mirror functional outcomes in experimental combination screens. Across two independent large-scale datasets, Trafikk identified synergistic combinations with >77% recall. Functional response predictions revealed both conserved and context-dependent mechanisms. While AKT-MEK co-inhibition consistently disrupted coordinated survival and apoptotic signalling in 742 cell lines, PI3K-BCL2 synergy arose through distinct death programs shaped by cell-context-specific network constraints. Trafikk combines predictive performance with mechanistic interpretability, capturing how and why drug synergy emerges across cellular contexts. Source code, installation instructions and usage tutorial are freely available at https://github.com/druglogics/trafikk.
White, F.; van der Ploeg, G. R.; Heintz-Buschart, A.; Dong, L.; Bouwmeester, H.; Smilde, A.; Westerhuis, J.
In multi-block data, the dominant sources of variation are not always most relevant to a response of interest, meaning that purely exploratory decompositions may fail to recover subtle but important response-associated structure. We introduce PESCAR, a supervised extension of Penalised Exponential Simultaneous Component Analysis (PESCA) that incorporates response information directly into the estimation of common, local, and distinct (CLD) structure across multiple data blocks. This allows simultaneous multiblock decomposition and response variable influenced recovery of latent structure. Through simulation studies, we show that PESCAR can detect weak response-related components across a range of settings, including different noise levels and model-rank mis-specification. Applied to a real multi-omics dataset, PESCAR recovers biologically meaningful response-associated patterns and retains interpretable block structure. We further demonstrate that sparsity in the fitted loading matrices admits a hypergraph-based interpretability layer, summarising overlapping support patterns across components and blocks. These results show that direct incorporation of response information into multiblock decomposition can improve detection of subtle relevant signal and facilitate interpretation in complex systems.
Bajzik, J.; Depope, A.; Zolfimoselo, Y.; Sharipov, A.; Lesayova, A.; Klein, H.; Richmond, A.; Vernardis, S.; Grauslys, A.; Andrejev, S.; Zelezniak, A.; Ralser, M.; Marioni, R.; Mondelli, M.; Robinson, M. R.
The incidence of the vast majority of neurodegenerative, cancer, and metabolic diseases generally increases exponentially with age. In large-scale biobanks, linking time-to-diagnosis information in electronic health records to multiple genomic ("multiomics") measures has the potential to reveal the genes and biological pathways involved in disease onset and progression. To date, association testing has commonly been conducted by testing one variable at a time using semiparametric Cox proportional hazards (CoxPH) models, which ignores correlation structure and increases the risk of false discoveries. To address these issues, we introduce a novel fully parametric Bayesian computational method, vampW, based on the Vector Approximate Message Passing framework applied to a Weibull model. vampW jointly models correlated features, while providing an interpretable hazard structure, producing a continuous survival curve, and incorporating prior knowledge. In an extensive simulation study, we demonstrate that joint modeling of omics data and time-to-event outcomes with vampW substantially reduces false discoveries in comparison to marginal testing and other forms of joint CoxPH models. In 53,018 individuals from the UK Biobank, vampW identifies 219 protein associations with 24 disease outcomes, most of which are not among the top marginal discoveries. We further correct protein levels for exponential age effects, identifying 1,308 associations and highlighting the sensitivity of the analysis to age-correction methodology. Our findings replicate in independent cohorts using different measurement technologies, within data from Iceland and a novel Generation Scotland proteomics dataset.
vampW also achieves significant improvement in the prediction of disease onset times: across 14 outcomes, it reduces the root mean squared error by over 32% and 26%, when compared to CoxPH variants and the deep learning approach DeepSurv, respectively, while maintaining predictive utility in minority populations. In summary, vampW offers accurate and interpretable variable selection and out-of-sample prediction within a single computational framework, making it a powerful tool for dissecting the genomic architecture of common complex disease onset.
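For readers unfamiliar with why a Weibull model yields an interpretable hazard and a continuous survival curve, the textbook two-parameter form can be sketched as follows. This is not the vampW parameterisation: how covariates, priors, and message passing enter the model is not given in the abstract, and the shape and scale values below are arbitrary illustrations.

```python
import math


def weibull_hazard(t, shape, scale):
    """Instantaneous risk h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_survival(t, shape, scale):
    """Probability of remaining event-free up to time t: exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


# Illustrative ages in years; shape > 1 gives a hazard that rises with age.
ages = [40, 60, 80]
hazards = [weibull_hazard(a, shape=5.0, scale=90.0) for a in ages]
survival = [weibull_survival(a, shape=5.0, scale=90.0) for a in ages]

for a, h, s in zip(ages, hazards, survival):
    print(f"age={a}  hazard={h:.5f}  survival={s:.4f}")
```

Because both functions are closed-form in the shape and scale parameters, a fitted model gives a smooth survival curve at any time point, unlike the step-function baseline of a semiparametric Cox model.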
Nguyen, T.; H. Ho, B.; Pan, M.; Flegg, J. A.; MCDONALD, M.; Drovandi, C.
Motivation: Kinetic models are central to systems biology, but enzyme-kinetic parameters compiled from the literature and databases are often incomplete, inconsistent, and measured under heterogeneous conditions. Classical parameter balancing helps infer missing parameters, yet it often lacks calibrated uncertainty, robustness to misspecification, and explicit treatment of source-level heterogeneity. Results: We develop a formal Bayesian parameter balancing framework that enforces thermodynamic constraints, estimates full posterior uncertainty, and validates calibration using leave-one-out cross-validation and posterior-predictive coverage. Beyond the classical Gaussian formulation, we introduce robust Student-t and skewed error models to improve reliability under outliers and model misspecification, and incorporate random effects to account for source or group variability across studies. The resulting approach yields thermodynamically consistent parameter sets with well-calibrated credible intervals on held-out data, offering a Bayesian parameter balancing approach useful to systems biology researchers.
Chan, J. K.; Ly, N. S.; Taverniti, O.; Gwynne, W. D.; Lieng, B. Y.; Affe, V.; Urquhart-Cox, V. T.; Alonzi, S. M.; Muhundan, M.; Denhart, A. J.; Edgar, L. J.; Quaile, A. T.; Montenegro-Burke, J. R.
Despite the emergence of cellular atlases like the Human Protein Atlas, no equivalent atlas exists for the human metabolome. Here, we present the Human Metabolome Atlas (HMA, hma.ccbr.utoronto.ca), a comprehensive map containing metabolomic profiles of 70 human cell lines across 22 tissues. With an ~8-fold increase in coverage compared to other resources, the HMA contains quantitative data for 1768 metabolites at the highest identification confidence, encompassing over 50 lipid classes and a broad range of metabolic pathways. This constitutes the most extensive human metabolomic atlas available. Leveraging the HMA, we identified specific metabolic regulation within pathways and cell types and characterized metabolic processes like glycosylation and ferroptosis. Lastly, we developed a publicly available, interactive web-portal to facilitate custom data analysis for the broader scientific community.
Tindle, C.; Penrose, H. M.; Sinha, S.; Mullick, M.; Carpio-Perkins, K.; Hayashi, M.; Carpinelli, S. S.; Mclaren, E.; Hsieh, C.-C.; Zablan, K.; Le, H. N.; Neill, J.; Katkar, G. D.; Sandborn, W. J.; Boland, B.; Ghosh, P.
Inflammatory bowel diseases (IBD) remain relapsing, treatment-refractory disorders marked by progressive tissue injury and inflammation despite expanding immune-targeted therapies. We established a prospective cohort integrating stromal biobanking, functional phenotyping, cross-cohort benchmarking, and outcome modeling to define disease-anchored cellular states. Colonic myofibroblasts from 34 individuals spanning health, ulcerative colitis, and Crohn's disease resolved into two dominant states: inflammatory (IMFs) and quiescent (QMFs) myofibroblasts. IMF predominance at recruitment forecasted progressive disease, increasing odds of worsening endoscopic severity despite therapy escalation by ~4.6 during follow-up, thereby linking early stromal biology to clinical endpoints. Unlike QMFs, IMFs exhibited a senescence-associated secretory phenotype that impaired epithelial stemness, barrier integrity, and innate immune fitness. State-guided prioritization identified EDNRB-antagonism as a high-confidence stromal intervention, reversing pathogenic phenotypes across orthogonal assays and species. Outcome simulation positioned stromal-state reversibility by EDNRB-antagonism as a precision axis, reducing odds of recalcitrance by ~96.4% and reframing treatment resistance as a reversible stromal state. Graphic Abstract: In this work, Tindle et al. identify reversible stromal states that govern recalcitrant IBD and nominate precision reprogramming of pathogenic myofibroblasts as a new therapeutic strategy.
Rasmi, D. S.; Krishnan, J.; Hashem, Y. A.; Palsson, B.; Khashef, M. T.; Monk, J.; Aziz, R. K.
Enterococci are Gram-positive opportunistic pathogens responsible for a wide range of nosocomial infections. One enterococcal species, Enterococcus faecium, is steadily increasing in prevalence and has been listed among major multidrug-resistant ESKAPE pathogens. To gain systems-level insights into its metabolism and support discovery of potential therapeutic targets, we constructed iDR479, a comprehensive manually curated genome-scale metabolic model (GEM) to serve as a digital twin for E. faecium TX0016 (strain DO). The reconstruction was curated through extensive homology searches and literature evidence, and further refined and gap-filled through experimental validation. Phenotypic profiling using Biolog microarrays enabled assessment of carbon source utilization, while amino acid leave-out growth assays allowed the evaluation of auxotrophies. The final refined model is 100% accurate in predicting amino acid auxotrophy and 85% accurate in predicting growth on sole carbon sources. Discrepancies between model predictions and experimental phenotypes identified specific knowledge gaps across metabolic pathways, including unresolved carbon source utilization phenotypes, e.g., psicose, sorbitol, and palatinose utilization. These gaps will guide future experimental characterization. Additionally, gene essentiality analysis was conducted to evaluate the predictive capacity of the iDR479 model. Since no experimental gene essentiality data are currently available for E. faecium, model predictions were compared against Tn-seq experimental results from E. faecalis MMH594. Under simulated rich medium conditions, iDR479 achieved 86.7% concordance with the experimental essentiality results of E. faecalis MMH594. iDR479 thus provides a framework for studying E. faecium, offers insights into its metabolic network, and serves as a source for guiding future research and identification of therapeutic targets.
Katic, I.; Papasaikas, P.; Gaidatzis, D.; Grosshans, H.
Multicopy transgene arrays remain widely used in C. elegans research. It is usually assumed that they behave neutrally, not impacting the phenotype under investigation. Here, we reveal that a previously reported heterochronic extra-molt phenotype associated with myrf-1(mg412) depends on the presence of the integrated molting reporter mgIs49. Nanopore long-read sequencing shows that mgIs49 is a massive 8.8-Mb insertion (around 50% of the size of its host chromosome) that disrupts the prmt-9 gene. Both mgIs49 and another array, maIs105, cause dysregulation of the transcriptome and accumulation of reads mapping to the promoter sequences used as components of the array. We identify additional arrays exceeding 4 Mb and show that variable molting defects occur across different transgenic lines when combined with myrf-1(mg412), implicating array size or composition in the synthetic phenotype. Our results underscore the necessity of replacing multicopy reporters in developmental studies with single-copy insertions or endogenous tagging whenever possible.
Myers, S. A.; Vasquez Castro, F.; Sanchez Solis, L. D.
Motivation: Post-translational modifications (PTMs) are critical to protein function, yet the function of most known modification sites remains uncharacterized. CRISPR-mediated phenotypic screens using base editors offer a powerful approach to dissecting PTM function at scale. However, existing sgRNA design tools for base editing applications are DNA-centric and lack the throughput required to integrate seamlessly with mass-spectrometry-based proteomics experimental outputs. Results: We introduce protein editing in R, PrEditR, an open-source, protein-centric tool for high-throughput sgRNA design for custom base editor screens. PrEditR enables users to designate specific amino acid residues in proteins and design protospacer sequences to target the endogenous gene to install missense mutations via base editors. Availability and Implementation: PrEditR is available on GitHub and Docker Hub.
Oesinghaus, L.; Park, M.; Shao, R.; Koh, P. W.; Seelig, G.
Cytokine biology is dispersed across hundreds of thousands of publications, making it difficult to use systematically when interpreting new experiments. Large language models (LLMs) can assist with focused literature interpretation, but ad hoc retrieval remains incomplete and unreliable. We present the Cytokine Effect Database (CytED), a framework for interfacing user-supplied experimental datasets with literature knowledge at scale. CytED uses a multi-step LLM pipeline to generate over a million cytokine-cell type-effect triples from 110,000 full-text publications, with annotations for experimental context and directional changes in genes, pathways, and cellular processes. This structure enables quantitative comparison between observed perturbation responses and prior literature across cytokines, cell types, and experimental contexts. Applied to in vitro IL-10 stimulation of PBMCs, CytED identifies unexpected pro-inflammatory features in monocytes and systematic in vivo-in vitro differences in cytotoxicity responses in CD8+ T cells. CytED infers cytokine signaling, distinguishes primary from secondary cytokine effects, and guides the design of combinatorial perturbation screens. Together, CytED establishes a general paradigm for converting unstructured domain literature into analytical tools that bridge literature and experiment.
Stevenson, E.-L.; Kelliher, C. M.; Kettenbach, A. N.; Loros, J. J.; Dunlap, J. C.
Circadian rhythms, ~24-hour biological cycles, enable organisms to anticipate rhythmic environmental cycles so they can assign proper day and night functions that align with those cycles. Circadian rhythms are defined by their ability to be reset by external cues, their capacity to continue to oscillate in the absence of those cues, and their capacity to maintain the rate of the clock across a range of ambient temperatures, a property known as temperature compensation. In the Neurospora clock, the White Collar Complex (WCC) drives expression of FRQ which nucleates a complex including FRH and CK1a that phosphorylates and thereby represses WCC activity. Work to date has suggested that kinases may be involved in temperature compensation and that in Neurospora the primary target of these is FRQ. Here we investigate the genetic relationship between two clock kinases, Casein Kinase I (ck-1a) and Casein Kinase II (cka), in their regulation of temperature compensation using novel alleles, ck-1aD135G and Δcka. We find that the clock relies on Casein Kinase I more at cold temperature, but this changes as temperature increases, and the clock relies more on Casein Kinase II at warm temperatures. Using quantitative proteomics on FRQ across temperatures, we find that the FRQ phosphorylation landscape is dependent on temperature and is altered in temperature compensation mutants. This leads to the development of a phosphorylation-driven model for temperature compensation, where key temperature compensation specific domains on FRQ are phosphorylated to regulate period length in response to temperature, including by Casein Kinase I and Casein Kinase II.
Karagöl, T.; Karagöl, A.
Alternative splicing and proteolytic truncation generate tens of thousands of protein isoforms in the human proteome, but the structural consequences for quaternary state, the level at which most signaling, enzymatic and regulatory function operates, have largely been examined one molecule at a time. Leveraging the recent expansion of the AlphaFold Database to predicted human homodimers, we systematically compared 5,168 canonical-versus-truncated homodimer pairs across the human proteome. In high-confidence canonical homodimers, truncation is associated with predicted structural conservation in 56.4% of pairs (mean 85 residues lost), complete interface ablation in 26.1% (mean 178 residues lost), and partial destabilization in 17.5% (mean 134 residues lost); a distinct fourth class (4.0% of the dataset, n = 208) shows truncation-associated emergence of a predicted high-confidence interface from a sub-threshold canonical baseline. Two reproducible rules govern these transitions: a topological asymmetry in which N-terminal losses are preferentially enriched ~1.6-fold in interface preservation while C-terminal losses are rare overall (~6% of pairs) and modestly under-represented in the conservation class, and a biophysical rule in which emergence-class proteins show substantially elevated intrinsic disorder content relative to ablation-class proteins, as measured by both AlphaFold pLDDT-defined disorder of the canonical structure (Cohen's d ≈ 1.39) and AIUPred peak binding propensity of the truncated isoform (Cohen's d ≈ 0.65). Formal pathway enrichment recovered only a small nucleotide-metabolism signal, indicating that these rules operate across diverse gene-functional categories. Truncation-associated remodeling of homodimer architecture thus constitutes a structural grammar of the human proteome rather than a specialty of any single regulatory family.
Kalra, S.; Sanchez, G.; Stubin, A.; Le, A.; Bakshian, A.; Ortiz Diaz, B.; Mark, B. M.; Pena, C.; Parker, E.; Johnston, E.; Hsu, E.; Brangham, G.; Bala-Mehta, I.; Perez, L.; Milrod, M.; Stanten, M.; Nakamura, M.; Hwang, P.; Ptaszynska, S.; Cander, S.; Park, S.; Tan, T. L.; Zhou, Y.; Coolon, J.
Gene-by-environment (GxE) interactions play a major role in shaping both phenotypic and molecular variation, with important implications for human health and disease. In this study, we used the Doxycycline (Dox) regulated, tetracycline-responsive (Tet-Off) promoter system to sequentially reduce or titrate gene expression levels of the essential yeast transcription factor Repressor Activator Protein 1 (RAP1) similar to a hypomorph allele series, across three distinct environments: Yeast Peptone Dextrose (YPD) media, YPD media with Heat Shock (HS), and Yeast Peptone Acetate (YPAC) media. We then performed RNA sequencing (RNA Seq) to assess global transcriptional responses to RAP1 reduction in these different growth environments. Our analysis first focused on the independent effects of varying RAP1 expression levels within and across environments. We then explored GxE interactions, revealing a subset of genes with significant consequences of reduced levels of RAP1 and environment-specific expression patterns. Notably, many genes exhibited opposite effects of RAP1 titration on gene expression when yeast were grown in YPAC media compared to YPD media and/or HS, suggesting environment-dependent regulatory architecture. This design reveals how cells integrate internal transcriptional and regulatory changes with external environmental cues, providing a deeper view of GxE architecture. Using Weighted Gene Co-expression Network Analysis (WGCNA), we identified co-regulated gene modules, and by combining this with transcription factor motif enrichment tests, our study identified candidate regulators driving their dynamics. Our findings demonstrate that gene regulatory networks can vary dramatically depending on the environmental context an organism experiences, which can then influence the specific phenotypes produced by a particular genetic perturbation. 
This illustrates the complexity of genotype-environment interactions and the importance of studying gene function in multiple environments to gain a truly comprehensive understanding of a gene's sometimes numerous and diverse functions.